MIREX 2013: Discovering Musical Patterns Using Audio Structural Segmentation Techniques

Authors

  • Oriol Nieto
  • Morwaread Farbood
Abstract

This extended abstract describes our pattern discovery algorithm submitted to the MIREX 2013 Discovery of Repeated Themes & Sections task. The algorithm estimates musical patterns by finding specific repetitions within a piece and applying perceptually inspired rules. We submitted four versions: two that take an audio track as input (monophonic and polyphonic) and two that take a symbolic representation (monophonic and polyphonic). Each version follows a similar pipeline, one that is common in audio structural segmentation: convert the music representation into a chromagram (or pitch class profiles), compute the key-invariant self-similarity matrix, and extract the most prominent repeated segments by analyzing the matrix diagonally. Extracted segments are then split into smaller segments whenever repetition is found within them, following perceptual rules regarding pattern length and number of rests. Segments that meet these requirements are considered patterns, and their occurrences are matched using the self-similarity matrix.

This document is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. http://creativecommons.org/licenses/by-nc-sa/3.0/ © 2013 Oriol Nieto & Morwaread Farbood.

1. DESCRIPTION OF THE ALGORITHM

For this year's MIREX submission, we decided to implement an algorithm generic enough to run on audio or symbolic music, either monophonic or polyphonic. For evaluation purposes, we submitted the following versions:

  • NF1: Symbolic, monophonic music.
  • NF2: Symbolic, polyphonic music.
  • NF3: Audio, monophonic music.
  • NF4: Audio, polyphonic music.

In all four versions we transform the input into a chromagram (or pitch class profile vector), a harmonic representation often used in audio structure segmentation [2, 5] (see Figure 1 for an example of a chromagram extracted from an audio-polyphonic input). In the audio versions we use a fixed frame size for the chroma vectors, whereas in the symbolic versions the frame size is determined by the shortest note or rest duration (i.e., the tatum).

Figure 1. Chromagram of the audio-polyphonic version of Beethoven Op. 2, No. 1, Movement 3.

To find repetition across a piece, we compute the key-invariant self-similarity matrix (SSM) of the chromagram using the Euclidean distance [3]. We then find the most similar segments using a modified version of the techniques described in [2], so that we detect the smallest repetitions or musical patterns instead of longer musical segments. To do so, we first detect longer segments and keep splitting them based on the repetition of other similar segments. We set a perceptually inspired threshold τ, as in [4], which is automatically adapted if no musical patterns are found. See Figure 2 for an example of the potential occurrences found in a key-invariant SSM.

Once we have found a set of potential patterns and their occurrences, we apply a set of perceptual rules [4] to filter out the patterns that are unlikely to constitute a musical motive (e.g., patterns dominated by silence or rests). Given this final set of patterns, we use the self-similarity matrix to find their occurrences, which are also part of the output of the algorithm.

2. RESULTS

We use the JKU Patterns Development Dataset to evaluate the four versions of our algorithm. This dataset contains annotations of five pieces:
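The SSM and diagonal-analysis steps of Section 1 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that key invariance is obtained by taking, for every pair of frames, the minimum Euclidean distance over the 12 circular pitch-class shifts, and that repeated segments are simply runs of below-threshold distances along a diagonal. The names `chord_frame`, `key_invariant_ssm`, `repeated_segments`, `tau`, and `min_len` are hypothetical, and the segment splitting and perceptual filtering described above are left out.

```python
import numpy as np


def chord_frame(*pitch_classes):
    """Toy binary chroma frame with the given pitch classes active (illustration only)."""
    frame = np.zeros(12)
    frame[[p % 12 for p in pitch_classes]] = 1.0
    return frame


def key_invariant_ssm(chroma):
    """Distance matrix for an (n_frames, 12) chromagram.

    Assumption: each entry is the minimum Euclidean distance between
    frame i and frame j over the 12 circular shifts of frame j.
    """
    chroma = np.asarray(chroma, dtype=float)
    n = chroma.shape[0]
    ssm = np.full((n, n), np.inf)
    for shift in range(12):
        shifted = np.roll(chroma, shift, axis=1)
        diff = chroma[:, None, :] - shifted[None, :, :]
        ssm = np.minimum(ssm, np.sqrt((diff ** 2).sum(axis=2)))
    return ssm


def repeated_segments(ssm, tau, min_len):
    """Scan each upper diagonal for runs of distances below tau.

    Returns (start_i, start_j, length) tuples for runs of at least
    min_len frames; start_j - start_i is the lag of the diagonal.
    """
    n = ssm.shape[0]
    found = []
    for lag in range(1, n):
        below = np.append(np.diagonal(ssm, offset=lag) < tau, False)
        run_start = None
        for i, is_below in enumerate(below):
            if is_below and run_start is None:
                run_start = i
            elif not is_below and run_start is not None:
                if i - run_start >= min_len:
                    found.append((run_start, run_start + lag, i - run_start))
                run_start = None
    return found


if __name__ == "__main__":
    # Four-frame motive, three filler frames, then the motive two semitones higher.
    motive = [chord_frame(0), chord_frame(0, 4), chord_frame(0, 7), chord_frame(0, 4, 7)]
    filler = [chord_frame(0, 1), chord_frame(0, 6), chord_frame(0, 3, 6)]
    chroma = np.array(motive + filler + [np.roll(f, 2) for f in motive])
    print(repeated_segments(key_invariant_ssm(chroma), tau=0.5, min_len=4))
```

In this toy input the transposed repeat of the motive is recovered as a single four-frame run at a lag of seven frames, even though the plain Euclidean distance between the original and transposed frames is nonzero. Real chromagrams are noisy, so the threshold would need tuning (the abstract mentions that τ is adapted automatically when no patterns are found).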


Similar resources

Mirex 2014 Entry: Music Segmentation Techniques and Greedy Path Finder Algorithm to Discover Musical Patterns

This extended abstract describes the pattern discovery submission to MIREX 2014 of an algorithm that uses music segmentation (or music structure analysis) techniques and a refined greedy method in order to identify the repetitive musical patterns of a given music piece, either represented symbolically or with an actual audio file. We obtain a harmonic representation of the input and compute the...


Identifying Polyphonic Musical Patterns From Audio Recordings Using Music Segmentation Techniques

This paper presents a method for discovering patterns of note collections that repeatedly occur in a piece of music. We assume occurrences of these patterns must appear at least twice across a musical work and that they may contain slight differences in harmony, timbre, or rhythm. We describe an algorithm that makes use of techniques from the music information retrieval task of music segmentati...


Mirex Submissions for Audio Chord Detection (no Training) and Structural Segmentation

This paper describes our approach to chord extraction from audio, a variant of which was submitted to the 2009 MIREX Chord Detection Task (No Training), and achieved the top ranking of 71.2%. The structural segmentation algorithm is a pre-processing step for the chord extraction, and was also submitted separately for the Structural Segmentation Task. It also achieved the top ranking in that cat...


Identifying Polyphonic Patterns from Audio Recordings Using Music Segmentation Techniques

This paper presents a method for discovering patterns of note collections that repeatedly occur in a piece of music. We assume occurrences of these patterns must appear at least twice across a musical work and that they may contain slight differences in harmony, timbre, or rhythm. We describe an algorithm that makes use of techniques from the music information retrieval task of music segmentati...


Automatic chord transcription from audio using computational models of musical context

This thesis is concerned with the automatic transcription of chords from audio, with an emphasis on modern popular music. Musical context such as the key and the structural segmentation aid the interpretation of chords in human beings. In this thesis we propose computational models that integrate such musical context into the automatic chord estimation process. We present a novel dynamic Bayesi...



Publication date: 2013